Overview

Dataset statistics

Number of variables: 40
Number of observations: 60428
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.4 MiB
Average record size in memory: 320.0 B
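
The two memory figures are consistent with every cell occupying 8 bytes. The following is a minimal sketch of that arithmetic; the 8-byte cell width is an assumption (64-bit numbers or object pointers), not something the report states.

```python
# Figures quoted from the dataset statistics above.
N_VARIABLES = 40
N_OBSERVATIONS = 60428

# Assumption: each cell is a 64-bit number or an 8-byte object pointer.
BYTES_PER_CELL = 8

record_size_bytes = N_VARIABLES * BYTES_PER_CELL
total_bytes = record_size_bytes * N_OBSERVATIONS
total_mib = total_bytes / 2**20  # 1 MiB = 2**20 bytes

print(f"Average record size: {record_size_bytes} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under that assumption the sketch reproduces the reported 320 B per record and 18.4 MiB in total.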

Variable types

Categorical: 25
Numeric: 14
Boolean: 1

Alerts

brand_name has a high cardinality: 111 distinct values
store_sales(in millions) is highly correlated with store_cost(in millions) and 1 other field
store_cost(in millions) is highly correlated with store_sales(in millions) and 1 other field
avg_cars_at home(approx) is highly correlated with avg_cars_at home(approx).1
avg_cars_at home(approx).1 is highly correlated with avg_cars_at home(approx)
SRP is highly correlated with store_sales(in millions) and 1 other field
gross_weight is highly correlated with net_weight
net_weight is highly correlated with gross_weight
store_sqft is highly correlated with grocery_sqft and 2 other fields
grocery_sqft is highly correlated with store_sqft
frozen_sqft is highly correlated with store_sqft and 1 other field
meat_sqft is highly correlated with store_sqft and 1 other field
coffee_bar is highly correlated with video_store and 3 other fields
video_store is highly correlated with coffee_bar and 3 other fields
salad_bar is highly correlated with coffee_bar and 3 other fields
prepared_food is highly correlated with coffee_bar and 3 other fields
florist is highly correlated with coffee_bar and 3 other fields
store_sales(in millions) is highly correlated with store_cost(in millions) and 2 other fields
store_cost(in millions) is highly correlated with store_sales(in millions) and 1 other field
unit_sales(in millions) is highly correlated with store_sales(in millions)
store_city is highly correlated with store_state and 7 other fields
avg. yearly_income is highly correlated with education and 1 other field
store_state is highly correlated with store_city and 7 other fields
prepared_food is highly correlated with store_city and 6 other fields
education is highly correlated with avg. yearly_income
food_category is highly correlated with food_family and 1 other field
avg_cars_at home(approx) is highly correlated with avg_cars_at home(approx).1
store_type is highly correlated with store_city and 6 other fields
florist is highly correlated with store_city and 6 other fields
coffee_bar is highly correlated with store_city and 6 other fields
avg_cars_at home(approx).1 is highly correlated with avg_cars_at home(approx)
member_card is highly correlated with avg. yearly_income
food_family is highly correlated with food_category and 1 other field
video_store is highly correlated with store_city and 6 other fields
sales_country is highly correlated with store_city and 1 other field
salad_bar is highly correlated with store_city and 6 other fields
food_department is highly correlated with food_category and 1 other field
food_category is highly correlated with food_department and 2 other fields
food_department is highly correlated with food_category and 2 other fields
food_family is highly correlated with food_category and 1 other field
store_sales(in millions) is highly correlated with store_cost(in millions) and 2 other fields
store_cost(in millions) is highly correlated with store_sales(in millions) and 1 other field
unit_sales(in millions) is highly correlated with store_sales(in millions) and 1 other field
promotion_name is highly correlated with sales_country and 12 other fields
sales_country is highly correlated with promotion_name and 6 other fields
marital_status is highly correlated with num_children_at_home
total_children is highly correlated with num_children_at_home
education is highly correlated with occupation and 3 other fields
member_card is highly correlated with avg. yearly_income and 1 other field
occupation is highly correlated with education and 1 other field
avg_cars_at home(approx) is highly correlated with education and 1 other field
avg. yearly_income is highly correlated with education and 2 other fields
num_children_at_home is highly correlated with marital_status and 2 other fields
avg_cars_at home(approx).1 is highly correlated with education and 1 other field
SRP is highly correlated with store_sales(in millions) and 1 other field
gross_weight is highly correlated with net_weight
net_weight is highly correlated with gross_weight
low_fat is highly correlated with food_category and 1 other field
store_type is highly correlated with promotion_name and 11 other fields
store_city is highly correlated with unit_sales(in millions) and 15 other fields
store_state is highly correlated with promotion_name and 13 other fields
store_sqft is highly correlated with promotion_name and 13 other fields
grocery_sqft is highly correlated with promotion_name and 13 other fields
frozen_sqft is highly correlated with promotion_name and 12 other fields
meat_sqft is highly correlated with promotion_name and 12 other fields
coffee_bar is highly correlated with promotion_name and 11 other fields
video_store is highly correlated with promotion_name and 11 other fields
salad_bar is highly correlated with store_type and 10 other fields
prepared_food is highly correlated with store_type and 10 other fields
florist is highly correlated with promotion_name and 11 other fields
media_type is highly correlated with promotion_name and 2 other fields
cost is highly correlated with promotion_name and 5 other fields
total_children has 5624 (9.3%) zeros
num_children_at_home has 37609 (62.2%) zeros
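
To illustrate what a "highly correlated" alert means for a numeric pair, here is a minimal sketch of a Pearson correlation check. The 0.9 threshold, the `pearson` helper, and the toy weight values are all hypothetical; the report does not state which coefficient or cut-off triggered each alert.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical cut-off; the report's actual threshold is not shown.
THRESHOLD = 0.9

# Toy data (not taken from the dataset): gross weight is net weight
# plus a fixed packaging weight, so the two columns move together.
net_weight = [10.0, 12.5, 8.0, 15.0, 9.5]
gross_weight = [w + 2.0 for w in net_weight]

r = pearson(net_weight, gross_weight)
is_high = abs(r) >= THRESHOLD
print(f"r = {r:.3f}, flagged = {is_high}")
```

A pair such as gross_weight and net_weight would be flagged under any reasonable threshold, which is why such alerts usually point to redundant columns rather than to a finding.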

Reproduction

Analysis started: 2024-09-15 07:26:49.410721
Analysis finished: 2024-09-15 07:27:32.699933
Duration: 43.29 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
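
The reported duration follows directly from the two timestamps. A small sketch of the check, using only the values quoted above:

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-09-15 07:26:49.410721", FMT)
finished = datetime.strptime("2024-09-15 07:27:32.699933", FMT)

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```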

Variables

food_category
Categorical

HIGH CORRELATION

Distinct: 45
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Vegetables: 7440
Snack Foods: 6919
Dairy: 3835
Meat: 3107
Fruit: 3080
Other values (40): 36047

Length

Max length: 20
Median length: 16
Mean length: 10.46648904
Min length: 4
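
The mean length can be cross-checked against the total character count reported for this variable under "Characters and Unicode". A minimal sketch of that arithmetic:

```python
# Figures quoted from the report for food_category.
total_characters = 632469
n_observations = 60428

# Mean string length is total characters divided by number of rows.
mean_length = total_characters / n_observations
print(f"Mean length: {mean_length:.8f}")
```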

Characters and Unicode

Total characters: 632469
Distinct characters: 42
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
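
As a rough illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a few sample values; the `category_counts` helper is hypothetical and is not how the report itself is implemented. The codes `Ll`, `Lu`, and `Zs` correspond to Lowercase Letter, Uppercase Letter, and Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# A few category names that appear in this variable.
sample = ["Breakfast Foods", "Snack Foods", "Dairy"]
counts = category_counts(sample)
print(dict(counts))
```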

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Breakfast Foods
2nd row: Breakfast Foods
3rd row: Breakfast Foods
4th row: Breakfast Foods
5th row: Breakfast Foods

Common Values

Value                Count    Frequency (%)
Vegetables           7440     12.3%
Snack Foods          6919     11.4%
Dairy                3835     6.3%
Meat                 3107     5.1%
Fruit                3080     5.1%
Jams and Jellies     2550     4.2%
Baking Goods         1947     3.2%
Breakfast Foods      1946     3.2%
Bread                1797     3.0%
Canned Soup          1722     2.8%
Other values (35)    26085    43.2%
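
Each frequency in the table is the value's count divided by the number of observations (60428). A short sketch that recomputes a few rows from the counts quoted above:

```python
n_observations = 60428

# Counts copied from the Common Values table for food_category.
counts = {
    "Vegetables": 7440,
    "Snack Foods": 6919,
    "Dairy": 3835,
    "Other values (35)": 26085,
}

# Frequency (%) = 100 * count / total rows, shown to one decimal place.
frequencies = {
    name: f"{100 * count / n_observations:.1f}%"
    for name, count in counts.items()
}

for name, freq in frequencies.items():
    print(f"{name}: {freq}")
```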

Length

Histogram of lengths of the category
ValueCountFrequency (%)
foods9968
 
10.3%
vegetables7619
 
7.9%
snack6919
 
7.2%
products4667
 
4.8%
and4140
 
4.3%
dairy3835
 
4.0%
meat3107
 
3.2%
fruit3080
 
3.2%
canned3071
 
3.2%
jams2550
 
2.6%
Other values (49)47394
49.2%

Most occurring characters

ValueCountFrequency (%)
e75716
 
12.0%
a59703
 
9.4%
s47208
 
7.5%
o38989
 
6.2%
35922
 
5.7%
r33840
 
5.4%
n33330
 
5.3%
t32034
 
5.1%
d30804
 
4.9%
i26744
 
4.2%
Other values (32)218179
34.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter504337
79.7%
Uppercase Letter92210
 
14.6%
Space Separator35922
 
5.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e75716
15.0%
a59703
11.8%
s47208
9.4%
o38989
 
7.7%
r33840
 
6.7%
n33330
 
6.6%
t32034
 
6.4%
d30804
 
6.1%
i26744
 
5.3%
l21064
 
4.2%
Other values (13)104905
20.8%
Uppercase Letter
ValueCountFrequency (%)
F15089
16.4%
S12995
14.1%
B11265
12.2%
P9401
10.2%
V7619
8.3%
C6932
7.5%
D6859
7.4%
J5860
 
6.4%
M4272
 
4.6%
E3091
 
3.4%
Other values (8)8827
9.6%
Space Separator
ValueCountFrequency (%)
35922
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin596547
94.3%
Common35922
 
5.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e75716
 
12.7%
a59703
 
10.0%
s47208
 
7.9%
o38989
 
6.5%
r33840
 
5.7%
n33330
 
5.6%
t32034
 
5.4%
d30804
 
5.2%
i26744
 
4.5%
l21064
 
3.5%
Other values (31)197115
33.0%
Common
ValueCountFrequency (%)
35922
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII632469
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e75716
 
12.0%
a59703
 
9.4%
s47208
 
7.5%
o38989
 
6.2%
35922
 
5.7%
r33840
 
5.4%
n33330
 
5.3%
t32034
 
5.1%
d30804
 
4.9%
i26744
 
4.2%
Other values (32)218179
34.5%

food_department
Categorical

HIGH CORRELATION

Distinct: 22
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Produce: 8521
Snack Foods: 6919
Household: 6185
Frozen Foods: 6126
Baking Goods: 4497
Other values (17): 28180

Length

Max length: 19
Median length: 15
Mean length: 10.10253525
Min length: 4

Characters and Unicode

Total characters: 610476
Distinct characters: 31
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Frozen Foods
2nd row: Frozen Foods
3rd row: Frozen Foods
4th row: Frozen Foods
5th row: Frozen Foods

Common Values

Value                 Count    Frequency (%)
Produce               8521     14.1%
Snack Foods           6919     11.4%
Household             6185     10.2%
Frozen Foods          6126     10.1%
Baking Goods          4497     7.4%
Canned Foods          4238     7.0%
Dairy                 3835     6.3%
Health and Hygiene    3807     6.3%
Beverages             3014     5.0%
Deli                  2787     4.6%
Other values (12)     10499    17.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
foods19164
20.1%
produce8521
 
8.9%
snack6919
 
7.2%
goods6294
 
6.6%
household6185
 
6.5%
frozen6126
 
6.4%
canned4638
 
4.9%
beverages4604
 
4.8%
baking4497
 
4.7%
dairy3835
 
4.0%
Other values (16)24707
25.9%

Most occurring characters

ValueCountFrequency (%)
o83844
13.7%
e58406
 
9.6%
d52152
 
8.5%
s41111
 
6.7%
a40057
 
6.6%
n35970
 
5.9%
35062
 
5.7%
r26563
 
4.4%
F25290
 
4.1%
c23017
 
3.8%
Other values (21)189004
31.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter483731
79.2%
Uppercase Letter91683
 
15.0%
Space Separator35062
 
5.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o83844
17.3%
e58406
12.1%
d52152
10.8%
s41111
8.5%
a40057
8.3%
n35970
7.4%
r26563
 
5.5%
c23017
 
4.8%
i18458
 
3.8%
l17155
 
3.5%
Other values (9)86998
18.0%
Uppercase Letter
ValueCountFrequency (%)
F25290
27.6%
H13799
15.1%
B11676
12.7%
S9935
 
10.8%
P9892
 
10.8%
D6622
 
7.2%
G6294
 
6.9%
C5248
 
5.7%
A1590
 
1.7%
E952
 
1.0%
Space Separator
ValueCountFrequency (%)
35062
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin575414
94.3%
Common35062
 
5.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o83844
14.6%
e58406
 
10.2%
d52152
 
9.1%
s41111
 
7.1%
a40057
 
7.0%
n35970
 
6.3%
r26563
 
4.6%
F25290
 
4.4%
c23017
 
4.0%
i18458
 
3.2%
Other values (20)170546
29.6%
Common
ValueCountFrequency (%)
35062
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII610476
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o83844
13.7%
e58406
 
9.6%
d52152
 
8.5%
s41111
 
6.7%
a40057
 
6.6%
n35970
 
5.9%
35062
 
5.7%
r26563
 
4.4%
F25290
 
4.1%
c23017
 
3.8%
Other values (21)189004
31.0%

food_family
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Food: 43284
Non-Consumable: 11573
Drink: 5571

Length

Max length: 14
Median length: 4
Mean length: 6.007364136
Min length: 4

Characters and Unicode

Total characters: 363013
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Food
2nd row: Food
3rd row: Food
4th row: Food
5th row: Food

Common Values

Value             Count    Frequency (%)
Food              43284    71.6%
Non-Consumable    11573    19.2%
Drink             5571     9.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
food43284
71.6%
non-consumable11573
 
19.2%
drink5571
 
9.2%

Most occurring characters

ValueCountFrequency (%)
o109714
30.2%
F43284
 
11.9%
d43284
 
11.9%
n28717
 
7.9%
e11573
 
3.2%
l11573
 
3.2%
b11573
 
3.2%
a11573
 
3.2%
m11573
 
3.2%
u11573
 
3.2%
Other values (8)68576
18.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter279439
77.0%
Uppercase Letter72001
 
19.8%
Dash Punctuation11573
 
3.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o109714
39.3%
d43284
 
15.5%
n28717
 
10.3%
e11573
 
4.1%
l11573
 
4.1%
b11573
 
4.1%
a11573
 
4.1%
m11573
 
4.1%
u11573
 
4.1%
s11573
 
4.1%
Other values (3)16713
 
6.0%
Uppercase Letter
ValueCountFrequency (%)
F43284
60.1%
C11573
 
16.1%
N11573
 
16.1%
D5571
 
7.7%
Dash Punctuation
ValueCountFrequency (%)
-11573
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin351440
96.8%
Common11573
 
3.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o109714
31.2%
F43284
 
12.3%
d43284
 
12.3%
n28717
 
8.2%
e11573
 
3.3%
l11573
 
3.3%
b11573
 
3.3%
a11573
 
3.3%
m11573
 
3.3%
u11573
 
3.3%
Other values (7)57003
16.2%
Common
ValueCountFrequency (%)
-11573
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII363013
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o109714
30.2%
F43284
 
11.9%
d43284
 
11.9%
n28717
 
7.9%
e11573
 
3.2%
l11573
 
3.2%
b11573
 
3.2%
a11573
 
3.2%
m11573
 
3.2%
u11573
 
3.2%
Other values (8)68576
18.9%

store_sales(in millions)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1033
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.541030648
Minimum: 0.51
Maximum: 22.92
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 0.51
5-th percentile: 1.8
Q1: 3.81
Median: 5.94
Q3: 8.67
95-th percentile: 13.08
Maximum: 22.92
Range: 22.41
Interquartile range (IQR): 4.86

Descriptive statistics

Standard deviation: 3.463046547
Coefficient of variation (CV): 0.5294343864
Kurtosis: 0.09300234333
Mean: 6.541030648
Median Absolute Deviation (MAD): 2.38
Skewness: 0.6783827681
Sum: 395261.4
Variance: 11.99269139
Monotonicity: Not monotonic
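
Several of these statistics are derived from one another. The sketch below recomputes the range, IQR, coefficient of variation, variance, and mean from the other figures reported for store_sales(in millions); it only checks internal consistency and makes no claim about how the report computes them.

```python
# Figures quoted from the report for store_sales(in millions).
minimum, maximum = 0.51, 22.92
q1, q3 = 3.81, 8.67
mean = 6.541030648
std = 3.463046547
total = 395261.4
n_observations = 60428

value_range = maximum - minimum          # Range = Maximum - Minimum
iqr = q3 - q1                            # IQR = Q3 - Q1
cv = std / mean                          # CV = standard deviation / mean
variance = std ** 2                      # Variance = standard deviation squared
mean_from_sum = total / n_observations   # Mean = Sum / number of rows

print(f"Range: {value_range:.2f}")
print(f"IQR: {iqr:.2f}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.5f}")
print(f"Mean from sum: {mean_from_sum:.4f}")
```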
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
5.04                   311      0.5%
7.95                   272      0.5%
5.4                    270      0.4%
4.8                    268      0.4%
7.41                   265      0.4%
5.52                   261      0.4%
6.84                   257      0.4%
8.52                   248      0.4%
3.6                    244      0.4%
2.28                   243      0.4%
Other values (1023)    57789    95.6%
ValueCountFrequency (%)
0.512
< 0.1%
0.523
< 0.1%
0.533
< 0.1%
0.541
 
< 0.1%
0.552
< 0.1%
0.561
 
< 0.1%
0.574
< 0.1%
0.583
< 0.1%
0.63
< 0.1%
0.612
< 0.1%
ValueCountFrequency (%)
22.921
 
< 0.1%
19.95
 
< 0.1%
19.853
 
< 0.1%
19.83
 
< 0.1%
19.7519
< 0.1%
19.75
 
< 0.1%
19.652
 
< 0.1%
19.63
 
< 0.1%
19.555
 
< 0.1%
19.53
 
< 0.1%

store_cost(in millions)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9919
Distinct (%): 16.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.619459501
Minimum: 0.1632
Maximum: 9.7265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 0.1632
5-th percentile: 0.70937
Q1: 1.5
Median: 2.3856
Q3: 3.484025
95-th percentile: 5.34452
Maximum: 9.7265
Range: 9.5633
Interquartile range (IQR): 1.984025

Descriptive statistics

Standard deviation: 1.453008709
Coefficient of variation (CV): 0.5546979096
Kurtosis: 0.5422317754
Mean: 2.619459501
Median Absolute Deviation (MAD): 0.97585
Skewness: 0.8329204966
Sum: 158288.6987
Variance: 2.111234309
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                  Count    Frequency (%)
1.512                  71       0.1%
2.16                   65       0.1%
3.024                  64       0.1%
2.592                  62       0.1%
1.728                  62       0.1%
2.016                  61       0.1%
1.584                  58       0.1%
2.052                  58       0.1%
2.352                  57       0.1%
1.368                  57       0.1%
Other values (9909)    59813    99.0%
ValueCountFrequency (%)
0.16321
< 0.1%
0.17051
< 0.1%
0.1761
< 0.1%
0.17921
< 0.1%
0.1861
< 0.1%
0.19532
< 0.1%
0.20131
< 0.1%
0.20141
< 0.1%
0.20281
< 0.1%
0.20522
< 0.1%
ValueCountFrequency (%)
9.72651
< 0.1%
9.53051
< 0.1%
9.5251
< 0.1%
9.5041
< 0.1%
9.4081
< 0.1%
9.3841
< 0.1%
9.33451
< 0.1%
9.28252
< 0.1%
9.21
< 0.1%
9.18851
< 0.1%

unit_sales(in millions)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.09316873
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 3
Q3: 4
95-th percentile: 4
Maximum: 6
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8276769106
Coefficient of variation (CV): 0.2675822055
Kurtosis: -0.31897095
Mean: 3.09316873
Median Absolute Deviation (MAD): 1
Skewness: 0.05250417485
Sum: 186914
Variance: 0.6850490683
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Value    Count    Frequency (%)
3        27482    45.5%
4        16581    27.4%
2        13417    22.2%
5        2058     3.4%
1        864      1.4%
6        26       < 0.1%

Value    Count    Frequency (%)
1        864      1.4%
2        13417    22.2%
3        27482    45.5%
4        16581    27.4%
5        2058     3.4%
6        26       < 0.1%

Value    Count    Frequency (%)
6        26       < 0.1%
5        2058     3.4%
4        16581    27.4%
3        27482    45.5%
2        13417    22.2%
1        864      1.4%

promotion_name
Categorical

HIGH CORRELATION

Distinct: 49
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Weekend Markdown: 2330
Two Day Sale: 2321
Price Savers: 2279
Price Winners: 2108
Save-It Sale: 2001
Other values (44): 49389

Length

Max length: 23
Median length: 21
Mean length: 14.17084795
Min length: 9

Characters and Unicode

Total characters: 856316
Distinct characters: 43
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Bag Stuffers
2nd row: Cash Register Lottery
3rd row: High Roller Savings
4th row: Cash Register Lottery
5th row: Double Down Sale

Common Values

Value                  Count    Frequency (%)
Weekend Markdown       2330     3.9%
Two Day Sale           2321     3.8%
Price Savers           2279     3.8%
Price Winners          2108     3.5%
Save-It Sale           2001     3.3%
Super Duper Savers     1986     3.3%
Super Savers           1930     3.2%
One Day Sale           1843     3.0%
Double Down Sale       1755     2.9%
High Roller Savings    1741     2.9%
Other values (39)      40134    66.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
sale10313
 
6.8%
price10173
 
6.7%
savers9898
 
6.5%
days7459
 
4.9%
savings6401
 
4.2%
for5677
 
3.7%
one4378
 
2.9%
super4305
 
2.8%
day4164
 
2.7%
two3814
 
2.5%
Other values (58)85647
56.3%

Most occurring characters

ValueCountFrequency (%)
e105202
 
12.3%
91801
 
10.7%
r65194
 
7.6%
a64785
 
7.6%
s49554
 
5.8%
S44145
 
5.2%
i40668
 
4.7%
l38186
 
4.5%
o35283
 
4.1%
n34382
 
4.0%
Other values (33)287116
33.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter610819
71.3%
Uppercase Letter151695
 
17.7%
Space Separator91801
 
10.7%
Dash Punctuation2001
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e105202
17.2%
r65194
10.7%
a64785
10.6%
s49554
 
8.1%
i40668
 
6.7%
l38186
 
6.3%
o35283
 
5.8%
n34382
 
5.6%
t26454
 
4.3%
v21700
 
3.6%
Other values (12)129411
21.2%
Uppercase Letter
ValueCountFrequency (%)
S44145
29.1%
D23433
15.4%
P12221
 
8.1%
B8925
 
5.9%
T8805
 
5.8%
C7044
 
4.6%
W6345
 
4.2%
G5929
 
3.9%
O5528
 
3.6%
I5466
 
3.6%
Other values (9)23854
15.7%
Space Separator
ValueCountFrequency (%)
91801
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2001
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin762514
89.0%
Common93802
 
11.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e105202
13.8%
r65194
 
8.5%
a64785
 
8.5%
s49554
 
6.5%
S44145
 
5.8%
i40668
 
5.3%
l38186
 
5.0%
o35283
 
4.6%
n34382
 
4.5%
t26454
 
3.5%
Other values (31)258661
33.9%
Common
ValueCountFrequency (%)
91801
97.9%
-2001
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII856316
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e105202
 
12.3%
91801
 
10.7%
r65194
 
7.6%
a64785
 
7.6%
s49554
 
5.8%
S44145
 
5.2%
i40668
 
4.7%
l38186
 
4.5%
o35283
 
4.1%
n34382
 
4.0%
Other values (33)287116
33.5%

sales_country
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

USA: 38892
Mexico: 17572
Canada: 3964

Length

Max length: 6
Median length: 3
Mean length: 4.069173231
Min length: 3

Characters and Unicode

Total characters: 245892
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USA
2nd row: USA
3rd row: USA
4th row: USA
5th row: USA

Common Values

Value     Count    Frequency (%)
USA       38892    64.4%
Mexico    17572    29.1%
Canada    3964     6.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
usa38892
64.4%
mexico17572
29.1%
canada3964
 
6.6%

Most occurring characters

ValueCountFrequency (%)
U38892
15.8%
S38892
15.8%
A38892
15.8%
M17572
7.1%
e17572
7.1%
x17572
7.1%
i17572
7.1%
c17572
7.1%
o17572
7.1%
a11892
 
4.8%
Other values (3)11892
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter138212
56.2%
Lowercase Letter107680
43.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e17572
16.3%
x17572
16.3%
i17572
16.3%
c17572
16.3%
o17572
16.3%
a11892
11.0%
n3964
 
3.7%
d3964
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
U38892
28.1%
S38892
28.1%
A38892
28.1%
M17572
12.7%
C3964
 
2.9%

Most occurring scripts

ValueCountFrequency (%)
Latin245892
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U38892
15.8%
S38892
15.8%
A38892
15.8%
M17572
7.1%
e17572
7.1%
x17572
7.1%
i17572
7.1%
c17572
7.1%
o17572
7.1%
a11892
 
4.8%
Other values (3)11892
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII245892
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U38892
15.8%
S38892
15.8%
A38892
15.8%
M17572
7.1%
e17572
7.1%
x17572
7.1%
i17572
7.1%
c17572
7.1%
o17572
7.1%
a11892
 
4.8%
Other values (3)11892
 
4.8%

marital_status
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

S: 30355
M: 30073

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 60428
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: S
4th row: M
5th row: M

Common Values

Value    Count    Frequency (%)
S        30355    50.2%
M        30073    49.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
s30355
50.2%
m30073
49.8%

Most occurring characters

ValueCountFrequency (%)
S30355
50.2%
M30073
49.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter60428
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S30355
50.2%
M30073
49.8%

Most occurring scripts

ValueCountFrequency (%)
Latin60428
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S30355
50.2%
M30073
49.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII60428
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S30355
50.2%
M30073
49.8%

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

F: 30942
M: 29486

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 60428
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: M
3rd row: F
4th row: F
5th row: M

Common Values

Value    Count    Frequency (%)
F        30942    51.2%
M        29486    48.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
f30942
51.2%
m29486
48.8%

Most occurring characters

ValueCountFrequency (%)
F30942
51.2%
M29486
48.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter60428
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F30942
51.2%
M29486
48.8%

Most occurring scripts

ValueCountFrequency (%)
Latin60428
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F30942
51.2%
M29486
48.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII60428
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F30942
51.2%
M29486
48.8%

total_children
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.533875025
Minimum: 0
Maximum: 5
Zeros: 5624
Zeros (%): 9.3%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.490164786
Coefficient of variation (CV): 0.5880971918
Kurtosis: -1.03956372
Mean: 2.533875025
Median Absolute Deviation (MAD): 1
Skewness: -0.01498392606
Sum: 153117
Variance: 2.22059109
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Value    Count    Frequency (%)
2        12518    20.7%
4        12427    20.6%
3        11921    19.7%
1        11770    19.5%
5        6168     10.2%
0        5624     9.3%

Value    Count    Frequency (%)
0        5624     9.3%
1        11770    19.5%
2        12518    20.7%
3        11921    19.7%
4        12427    20.6%
5        6168     10.2%

Value    Count    Frequency (%)
5        6168     10.2%
4        12427    20.6%
3        11921    19.7%
2        12518    20.7%
1        11770    19.5%
0        5624     9.3%

education
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Partial High School: 18201
High School Degree: 17838
Bachelors Degree: 15994
Partial College: 5284
Graduate Degree: 3111

Length

Max length: 19
Median length: 18
Mean length: 17.35506719
Min length: 15

Characters and Unicode

Total characters: 1048732
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Partial High School
2nd row: Bachelors Degree
3rd row: Partial High School
4th row: High School Degree
5th row: Partial High School

Common Values

Value                  Count    Frequency (%)
Partial High School    18201    30.1%
High School Degree     17838    29.5%
Bachelors Degree       15994    26.5%
Partial College        5284     8.7%
Graduate Degree        3111     5.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
degree36943
23.5%
high36039
23.0%
school36039
23.0%
partial23485
15.0%
bachelors15994
10.2%
college5284
 
3.4%
graduate3111
 
2.0%

Most occurring characters

ValueCountFrequency (%)
e140502
13.4%
96467
9.2%
o93356
8.9%
h88072
8.4%
l86086
8.2%
r79533
 
7.6%
g78266
 
7.5%
a69186
 
6.6%
i59524
 
5.7%
c52033
 
5.0%
Other values (11)205707
19.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter795370
75.8%
Uppercase Letter156895
 
15.0%
Space Separator96467
 
9.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e140502
17.7%
o93356
11.7%
h88072
11.1%
l86086
10.8%
r79533
10.0%
g78266
9.8%
a69186
8.7%
i59524
7.5%
c52033
 
6.5%
t26596
 
3.3%
Other values (3)22216
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
D36943
23.5%
S36039
23.0%
H36039
23.0%
P23485
15.0%
B15994
10.2%
C5284
 
3.4%
G3111
 
2.0%
Space Separator
ValueCountFrequency (%)
96467
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin952265
90.8%
Common96467
 
9.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e140502
14.8%
o93356
9.8%
h88072
9.2%
l86086
9.0%
r79533
8.4%
g78266
8.2%
a69186
7.3%
i59524
 
6.3%
c52033
 
5.5%
D36943
 
3.9%
Other values (10)168764
17.7%
Common
ValueCountFrequency (%)
96467
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1048732
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e140502
13.4%
96467
9.2%
o93356
8.9%
h88072
8.4%
l86086
8.2%
r79533
 
7.6%
g78266
 
7.5%
a69186
 
6.6%
i59524
 
5.7%
c52033
 
5.0%
Other values (11)205707
19.6%

member_card
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Bronze: 33807
Normal: 13867
Golden: 7556
Silver: 5198

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 362568
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Normal
2nd row: Silver
3rd row: Normal
4th row: Bronze
5th row: Bronze

Common Values

Value   Count  Frequency (%)
Bronze  33807  55.9%
Normal  13867  22.9%
Golden  7556   12.5%
Silver  5198   8.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
bronze33807
55.9%
normal13867
22.9%
golden7556
 
12.5%
silver5198
 
8.6%

Most occurring characters

ValueCountFrequency (%)
o55230
15.2%
r52872
14.6%
e46561
12.8%
n41363
11.4%
B33807
9.3%
z33807
9.3%
l26621
7.3%
N13867
 
3.8%
m13867
 
3.8%
a13867
 
3.8%
Other values (5)30706
8.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter302140
83.3%
Uppercase Letter60428
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o55230
18.3%
r52872
17.5%
e46561
15.4%
n41363
13.7%
z33807
11.2%
l26621
8.8%
m13867
 
4.6%
a13867
 
4.6%
d7556
 
2.5%
i5198
 
1.7%
Uppercase Letter
ValueCountFrequency (%)
B33807
55.9%
N13867
22.9%
G7556
 
12.5%
S5198
 
8.6%

Most occurring scripts

ValueCountFrequency (%)
Latin362568
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o55230
15.2%
r52872
14.6%
e46561
12.8%
n41363
11.4%
B33807
9.3%
z33807
9.3%
l26621
7.3%
N13867
 
3.8%
m13867
 
3.8%
a13867
 
3.8%
Other values (5)30706
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII362568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o55230
15.2%
r52872
14.6%
e46561
12.8%
n41363
11.4%
B33807
9.3%
z33807
9.3%
l26621
7.3%
N13867
 
3.8%
m13867
 
3.8%
a13867
 
3.8%
Other values (5)30706
8.5%

occupation
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Professional: 19915
Skilled Manual: 15995
Manual: 14624
Management: 8805
Clerical: 1089

Length

Max length: 14
Median length: 12
Mean length: 10.71384127
Min length: 6

Characters and Unicode

Total characters: 647416
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Skilled Manual
2nd row: Professional
3rd row: Manual
4th row: Manual
5th row: Skilled Manual

Common Values

Value           Count  Frequency (%)
Professional    19915  33.0%
Skilled Manual  15995  26.5%
Manual          14624  24.2%
Management      8805   14.6%
Clerical        1089   1.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
manual30619
40.1%
professional19915
26.1%
skilled15995
20.9%
management8805
 
11.5%
clerical1089
 
1.4%

Most occurring characters

ValueCountFrequency (%)
a99852
15.4%
l84702
13.1%
n68144
10.5%
e54609
8.4%
o39830
 
6.2%
s39830
 
6.2%
M39424
 
6.1%
i36999
 
5.7%
u30619
 
4.7%
r21004
 
3.2%
Other values (11)132403
20.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter554998
85.7%
Uppercase Letter76423
 
11.8%
Space Separator15995
 
2.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a99852
18.0%
l84702
15.3%
n68144
12.3%
e54609
9.8%
o39830
 
7.2%
s39830
 
7.2%
i36999
 
6.7%
u30619
 
5.5%
r21004
 
3.8%
f19915
 
3.6%
Other values (6)59494
10.7%
Uppercase Letter
ValueCountFrequency (%)
M39424
51.6%
P19915
26.1%
S15995
20.9%
C1089
 
1.4%
Space Separator
ValueCountFrequency (%)
15995
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin631421
97.5%
Common15995
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a99852
15.8%
l84702
13.4%
n68144
10.8%
e54609
8.6%
o39830
 
6.3%
s39830
 
6.3%
M39424
 
6.2%
i36999
 
5.9%
u30619
 
4.8%
r21004
 
3.3%
Other values (10)116408
18.4%
Common
ValueCountFrequency (%)
15995
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII647416
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a99852
15.4%
l84702
13.1%
n68144
10.5%
e54609
8.4%
o39830
 
6.2%
s39830
 
6.2%
M39424
 
6.1%
i36999
 
5.7%
u30619
 
4.7%
r21004
 
3.2%
Other values (11)132403
20.5%

houseowner
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.1 KiB

True: 36510
False: 23918

Value  Count  Frequency (%)
True   36510  60.4%
False  23918  39.6%

avg_cars_at home(approx)
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

2.0: 18268
3.0: 16961
1.0: 13643
4.0: 7974
0.0: 3582

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 4.0
3rd row: 1.0
4th row: 2.0
5th row: 2.0

Common Values

Value  Count  Frequency (%)
2.0    18268  30.2%
3.0    16961  28.1%
1.0    13643  22.6%
4.0    7974   13.2%
0.0    3582   5.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
2.018268
30.2%
3.016961
28.1%
1.013643
22.6%
4.07974
13.2%
0.03582
 
5.9%

Most occurring characters

ValueCountFrequency (%)
064010
35.3%
.60428
33.3%
218268
 
10.1%
316961
 
9.4%
113643
 
7.5%
47974
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number120856
66.7%
Other Punctuation60428
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
064010
53.0%
218268
 
15.1%
316961
 
14.0%
113643
 
11.3%
47974
 
6.6%
Other Punctuation
ValueCountFrequency (%)
.60428
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common181284
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
064010
35.3%
.60428
33.3%
218268
 
10.1%
316961
 
9.4%
113643
 
7.5%
47974
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII181284
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
064010
35.3%
.60428
33.3%
218268
 
10.1%
316961
 
9.4%
113643
 
7.5%
47974
 
4.4%

avg. yearly_income
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

$30K - $50K: 19514
$10K - $30K: 12959
$50K - $70K: 10493
$70K - $90K: 7544
$130K - $150K: 3410
Other values (3): 6508

Length

Max length: 13
Median length: 11
Mean length: 11.16570133
Min length: 7

Characters and Unicode

Total characters: 674721
Distinct characters: 11
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: $10K - $30K
2nd row: $50K - $70K
3rd row: $10K - $30K
4th row: $30K - $50K
5th row: $30K - $50K

Common Values

Value          Count  Frequency (%)
$30K - $50K    19514  32.3%
$10K - $30K    12959  21.4%
$50K - $70K    10493  17.4%
$70K - $90K    7544   12.5%
$130K - $150K  3410   5.6%
$90K - $110K   2737   4.5%
$110K - $130K  2590   4.3%
$150K +        1181   2.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
60428
33.6%
30k32473
18.0%
50k30007
16.7%
70k18037
 
10.0%
10k12959
 
7.2%
90k10281
 
5.7%
130k6000
 
3.3%
110k5327
 
3.0%
150k4591
 
2.5%

Most occurring characters

ValueCountFrequency (%)
$119675
17.7%
0119675
17.7%
K119675
17.7%
119675
17.7%
-59247
8.8%
338473
 
5.7%
534598
 
5.1%
134204
 
5.1%
718037
 
2.7%
910281
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number255268
37.8%
Currency Symbol119675
17.7%
Uppercase Letter119675
17.7%
Space Separator119675
17.7%
Dash Punctuation59247
 
8.8%
Math Symbol1181
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0119675
46.9%
338473
 
15.1%
534598
 
13.6%
134204
 
13.4%
718037
 
7.1%
910281
 
4.0%
Currency Symbol
ValueCountFrequency (%)
$119675
100.0%
Uppercase Letter
ValueCountFrequency (%)
K119675
100.0%
Space Separator
ValueCountFrequency (%)
119675
100.0%
Dash Punctuation
ValueCountFrequency (%)
-59247
100.0%
Math Symbol
ValueCountFrequency (%)
+1181
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common555046
82.3%
Latin119675
 
17.7%

Most frequent character per script

Common
ValueCountFrequency (%)
$119675
21.6%
0119675
21.6%
119675
21.6%
-59247
10.7%
338473
 
6.9%
534598
 
6.2%
134204
 
6.2%
718037
 
3.2%
910281
 
1.9%
+1181
 
0.2%
Latin
ValueCountFrequency (%)
K119675
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII674721
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$119675
17.7%
0119675
17.7%
K119675
17.7%
119675
17.7%
-59247
8.8%
338473
 
5.7%
534598
 
5.1%
134204
 
5.1%
718037
 
2.7%
910281
 
1.5%
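
The report types avg. yearly_income as Categorical, but its eight labels are income bands with a natural order, and a plain alphabetical sort would place "$110K - $130K" before "$30K - $50K". The following is a minimal sketch of one way to recover that order, assuming every label starts with a dollar amount in thousands as in the Common Values table; `lower_bound` and `band_code` are illustrative names, not part of the report.

```python
import re

# The eight band labels as they appear in the Common Values table.
bands = [
    "$30K - $50K", "$10K - $30K", "$50K - $70K", "$70K - $90K",
    "$130K - $150K", "$90K - $110K", "$110K - $130K", "$150K +",
]


def lower_bound(band: str) -> int:
    """Return the lower bound of a band in thousands of dollars."""
    match = re.match(r"\$(\d+)K", band)
    if match is None:
        raise ValueError(f"unrecognised band label: {band!r}")
    return int(match.group(1))


ordered = sorted(bands, key=lower_bound)
# Integer codes (0 = lowest band) that could serve as an ordinal feature.
band_code = {band: code for code, band in enumerate(ordered)}
print(ordered)
```

Whether an ordinal encoding is appropriate depends on the downstream model; the sketch only shows that the order is recoverable from the labels themselves.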

num_children_at_home
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8293506322
Minimum: 0
Maximum: 5
Zeros: 37609
Zeros (%): 62.2%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 4
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.303423873
Coefficient of variation (CV): 1.571619798
Kurtosis: 1.467237541
Mean: 0.8293506322
Median Absolute Deviation (MAD): 0
Skewness: 1.554279648
Sum: 50116
Variance: 1.698913792
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
0      37609  62.2%
1      8811   14.6%
2      5841   9.7%
3      4391   7.3%
4      2430   4.0%
5      1346   2.2%

Value  Count  Frequency (%)
0      37609  62.2%
1      8811   14.6%
2      5841   9.7%
3      4391   7.3%
4      2430   4.0%
5      1346   2.2%

Value  Count  Frequency (%)
5      1346   2.2%
4      2430   4.0%
3      4391   7.3%
2      5841   9.7%
1      8811   14.6%
0      37609  62.2%
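
The quantile statistics for num_children_at_home follow directly from the cumulative counts in the frequency table. The sketch below uses the nearest-rank definition, which is an assumption: the report does not state its method, and an interpolating definition could differ on other data. The helper name `quantile` is illustrative.

```python
import math

# Value -> count for num_children_at_home, from the table above.
counts = {0: 37609, 1: 8811, 2: 5841, 3: 4391, 4: 2430, 5: 1346}


def quantile(counts: dict, q: float) -> int:
    """Nearest-rank quantile: the smallest value whose cumulative
    count reaches ceil(q * n)."""
    n = sum(counts.values())
    rank = max(1, math.ceil(q * n))
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= rank:
            return value
    raise ValueError("q must lie in [0, 1]")


q1, median, q3, p95 = (quantile(counts, q) for q in (0.25, 0.5, 0.75, 0.95))
print(q1, median, q3, p95, q3 - q1)
```

Because 62.2% of rows are zero, both Q1 and the median fall on 0, which is why the report raises the ZEROS alert and shows a MAD of 0.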

avg_cars_at home(approx).1
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

2.0: 18268
3.0: 16961
1.0: 13643
4.0: 7974
0.0: 3582

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 4.0
3rd row: 1.0
4th row: 2.0
5th row: 2.0

Common Values

Value  Count  Frequency (%)
2.0    18268  30.2%
3.0    16961  28.1%
1.0    13643  22.6%
4.0    7974   13.2%
0.0    3582   5.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
2.018268
30.2%
3.016961
28.1%
1.013643
22.6%
4.07974
13.2%
0.03582
 
5.9%

Most occurring characters

ValueCountFrequency (%)
064010
35.3%
.60428
33.3%
218268
 
10.1%
316961
 
9.4%
113643
 
7.5%
47974
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number120856
66.7%
Other Punctuation60428
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
064010
53.0%
218268
 
15.1%
316961
 
14.0%
113643
 
11.3%
47974
 
6.6%
Other Punctuation
ValueCountFrequency (%)
.60428
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common181284
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
064010
35.3%
.60428
33.3%
218268
 
10.1%
316961
 
9.4%
113643
 
7.5%
47974
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII181284
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
064010
35.3%
.60428
33.3%
218268
 
10.1%
316961
 
9.4%
113643
 
7.5%
47974
 
4.4%
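
Every figure reported for avg_cars_at home(approx).1 matches the one reported for avg_cars_at home(approx), and a ".1" suffix is what pandas appends when a source file repeats a column header, so the two columns are very likely copies of each other. The following is a dependency-free sketch of how such a duplicate could be detected and dropped before profiling; the toy table and the function name `drop_duplicate_columns` are illustrative assumptions, not part of the report.

```python
# Toy column-oriented table. The avg_cars values are the five Sample rows
# above; the houseowner values are hypothetical filler.
table = {
    "avg_cars_at home(approx)": ["1.0", "4.0", "1.0", "2.0", "2.0"],
    "avg_cars_at home(approx).1": ["1.0", "4.0", "1.0", "2.0", "2.0"],
    "houseowner": [True, False, True, True, False],
}


def drop_duplicate_columns(columns: dict) -> dict:
    """Keep only the first of any columns whose values are identical."""
    kept = {}
    for name, values in columns.items():
        if any(values == other for other in kept.values()):
            continue  # same content as a column that was already kept
        kept[name] = values
    return kept


deduplicated = drop_duplicate_columns(table)
print(list(deduplicated))
```

Dropping the copy would also remove the mutual HIGH CORRELATION alerts between the two columns, since one of them no longer exists.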

brand_name
Categorical

HIGH CARDINALITY

Distinct: 111
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Hermanos: 1839
Ebony: 1729
Tell Tale: 1728
Tri-State: 1633
High Top: 1592
Other values (106): 51907

Length

Max length: 13
Median length: 10
Mean length: 7.521993116
Min length: 3

Characters and Unicode

Total characters: 454539
Distinct characters: 46
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Carrington
2nd row: Carrington
3rd row: Carrington
4th row: Carrington
5th row: Golden

Common Values

Value               Count  Frequency (%)
Hermanos            1839   3.0%
Ebony               1729   2.9%
Tell Tale           1728   2.9%
Tri-State           1633   2.7%
High Top            1592   2.6%
Horatio             1436   2.4%
Nationeel           1425   2.4%
Fast                1405   2.3%
Fort West           1349   2.2%
Sunset              1331   2.2%
Other values (101)  44961  74.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
high2805
 
3.6%
best2637
 
3.4%
top1961
 
2.5%
hermanos1839
 
2.4%
red1831
 
2.3%
better1735
 
2.2%
ebony1729
 
2.2%
tale1728
 
2.2%
tell1728
 
2.2%
tri-state1633
 
2.1%
Other values (115)58415
74.9%

Most occurring characters

ValueCountFrequency (%)
e43594
 
9.6%
a34134
 
7.5%
t33830
 
7.4%
o33500
 
7.4%
i29099
 
6.4%
l27976
 
6.2%
n26881
 
5.9%
r23548
 
5.2%
s19242
 
4.2%
17613
 
3.9%
Other values (36)165122
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter350038
77.0%
Uppercase Letter85255
 
18.8%
Space Separator17613
 
3.9%
Dash Punctuation1633
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e43594
12.5%
a34134
9.8%
t33830
9.7%
o33500
9.6%
i29099
8.3%
l27976
8.0%
n26881
7.7%
r23548
 
6.7%
s19242
 
5.5%
u11873
 
3.4%
Other values (13)66361
19.0%
Uppercase Letter
ValueCountFrequency (%)
B12675
14.9%
T10586
12.4%
C9703
11.4%
H6883
 
8.1%
S6806
 
8.0%
P4566
 
5.4%
F4519
 
5.3%
R4067
 
4.8%
G3730
 
4.4%
E3707
 
4.3%
Other values (11)18013
21.1%
Space Separator
ValueCountFrequency (%)
17613
100.0%
Dash Punctuation
ValueCountFrequency (%)
-1633
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin435293
95.8%
Common19246
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e43594
 
10.0%
a34134
 
7.8%
t33830
 
7.8%
o33500
 
7.7%
i29099
 
6.7%
l27976
 
6.4%
n26881
 
6.2%
r23548
 
5.4%
s19242
 
4.4%
B12675
 
2.9%
Other values (34)150814
34.6%
Common
ValueCountFrequency (%)
17613
91.5%
-1633
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII454539
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e43594
 
9.6%
a34134
 
7.5%
t33830
 
7.4%
o33500
 
7.4%
i29099
 
6.4%
l27976
 
6.2%
n26881
 
5.9%
r23548
 
5.2%
s19242
 
4.2%
17613
 
3.9%
Other values (36)165122
36.3%
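
The HIGH CARDINALITY alert on brand_name reflects 111 distinct values, none of which exceeds 3.0% of rows. One common response, shown here only as a sketch, is to lump infrequent categories into a single bucket before encoding. The function name `lump_rare`, the `min_share` threshold and the toy data are all assumptions chosen for illustration; the report does not prescribe any treatment.

```python
from collections import Counter


def lump_rare(values, min_share=0.2, other="Other"):
    """Replace categories whose share of rows is below min_share."""
    counts = Counter(values)
    n = len(values)
    keep = {cat for cat, count in counts.items() if count / n >= min_share}
    return [v if v in keep else other for v in values]


# Toy data: the real column has 111 distinct brands over 60428 rows.
brands = ["Hermanos"] * 5 + ["Ebony"] * 4 + ["Carrington"]
lumped = lump_rare(brands)
print(Counter(lumped))
```

On the real column a much smaller threshold would be needed, and its choice is a modelling decision that trades detail for fewer encoded levels.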

SRP
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 315
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.115258489
Minimum: 0.5
Maximum: 3.98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 0.64
Q1: 1.41
Median: 2.13
Q3: 2.79
95-th percentile: 3.75
Maximum: 3.98
Range: 3.48
Interquartile range (IQR): 1.38

Descriptive statistics

Standard deviation: 0.9328285609
Coefficient of variation (CV): 0.4409997953
Kurtosis: -0.8945267185
Mean: 2.115258489
Median Absolute Deviation (MAD): 0.69
Skewness: 0.1379245501
Sum: 127820.84
Variance: 0.8701691241
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
2.65                585    1.0%
2.47                558    0.9%
2.59                465    0.8%
1.68                453    0.7%
2.7                 414    0.7%
1.55                411    0.7%
2.95                405    0.7%
2.76                394    0.7%
1.74                386    0.6%
2.13                382    0.6%
Other values (305)  55975  92.6%

Value  Count  Frequency (%)
0.5    75     0.1%
0.51   228    0.4%
0.52   105    0.2%
0.53   324    0.5%
0.54   178    0.3%
0.55   238    0.4%
0.56   184    0.3%
0.57   347    0.6%
0.58   162    0.3%
0.59   188    0.3%

Value  Count  Frequency (%)
3.98   205    0.3%
3.97   127    0.2%
3.96   114    0.2%
3.95   321    0.5%
3.94   80     0.1%
3.93   149    0.2%
3.92   65     0.1%
3.91   206    0.3%
3.9    48     0.1%
3.89   158    0.3%

gross_weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 376
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.80643311
Minimum: 6
Maximum: 21.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 6
5-th percentile: 6.96
Q1: 9.7
Median: 13.6
Q3: 17.7
95-th percentile: 21.2
Maximum: 21.9
Range: 15.9
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 4.622692755
Coefficient of variation (CV): 0.3348216529
Kurtosis: -1.231772194
Mean: 13.80643311
Median Absolute Deviation (MAD): 4
Skewness: 0.09297548278
Sum: 834295.14
Variance: 21.36928831
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
17.1                704    1.2%
19.9                624    1.0%
14.7                621    1.0%
17.2                588    1.0%
20.9                563    0.9%
13.7                558    0.9%
13.2                551    0.9%
16.1                551    0.9%
14.5                542    0.9%
18.7                535    0.9%
Other values (366)  54591  90.3%

Value  Count  Frequency (%)
6      42     0.1%
6.03   32     0.1%
6.04   40     0.1%
6.06   51     0.1%
6.09   33     0.1%
6.11   78     0.1%
6.12   39     0.1%
6.13   41     0.1%
6.14   67     0.1%
6.15   40     0.1%

Value  Count  Frequency (%)
21.9   500    0.8%
21.8   530    0.9%
21.7   509    0.8%
21.6   315    0.5%
21.5   299    0.5%
21.4   293    0.5%
21.3   453    0.7%
21.2   497    0.8%
21.1   293    0.5%
21     407    0.7%

net_weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 332
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.79628914
Minimum: 3.05
Maximum: 20.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 3.05
5-th percentile: 4.94
Q1: 7.71
Median: 11.6
Q3: 16
95-th percentile: 19.2
Maximum: 20.8
Range: 17.75
Interquartile range (IQR): 8.29

Descriptive statistics

Standard deviation: 4.682986189
Coefficient of variation (CV): 0.3969880811
Kurtosis: -1.193367589
Mean: 11.79628914
Median Absolute Deviation (MAD): 4.1
Skewness: 0.1066776009
Sum: 712826.16
Variance: 21.93035964
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
16.2                1001   1.7%
11.6                951    1.6%
10.6                904    1.5%
11.1                814    1.3%
16.7                809    1.3%
18.7                778    1.3%
19.7                777    1.3%
15.2                774    1.3%
11.3                769    1.3%
16.6                717    1.2%
Other values (322)  52134  86.3%

Value  Count  Frequency (%)
3.05   51     0.1%
3.09   33     0.1%
3.11   75     0.1%
3.13   36     0.1%
3.28   36     0.1%
3.3    44     0.1%
3.38   38     0.1%
3.4    32     0.1%
3.42   31     0.1%
3.59   32     0.1%

Value  Count  Frequency (%)
20.8   89     0.1%
20.7   251    0.4%
20.6   149    0.2%
20.5   33     0.1%
20.3   74     0.1%
20.2   307    0.5%
20.1   192    0.3%
20     139    0.2%
19.8   294    0.5%
19.7   777    1.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

1.0: 33759
0.0: 26669

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
1.0    33759  55.9%
0.0    26669  44.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.033759
55.9%
0.026669
44.1%

Most occurring characters

ValueCountFrequency (%)
087097
48.0%
.60428
33.3%
133759
 
18.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number120856
66.7%
Other Punctuation60428
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
087097
72.1%
133759
 
27.9%
Other Punctuation
ValueCountFrequency (%)
.60428
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common181284
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
087097
48.0%
.60428
33.3%
133759
 
18.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII181284
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
087097
48.0%
.60428
33.3%
133759
 
18.6%

low_fat
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

0.0: 39252
1.0: 21176

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 1.0

Common Values

Value  Count  Frequency (%)
0.0    39252  65.0%
1.0    21176  35.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.039252
65.0%
1.021176
35.0%

Most occurring characters

ValueCountFrequency (%)
099680
55.0%
.60428
33.3%
121176
 
11.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number120856
66.7%
Other Punctuation60428
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
099680
82.5%
121176
 
17.5%
Other Punctuation
ValueCountFrequency (%)
.60428
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common181284
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
099680
55.0%
.60428
33.3%
121176
 
11.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII181284
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
099680
55.0%
.60428
33.3%
121176
 
11.7%
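
low_fat is typed as Categorical, yet it holds only the two three-character strings "0.0" and "1.0", which makes it a flag in the same sense as the Boolean houseowner column. The following is a minimal sketch of a strict conversion, assuming those two spellings are the only ones present, as the Distinct count of 2 suggests; `to_flag` is an illustrative name.

```python
def to_flag(text: str) -> bool:
    """Map the strings "0.0" and "1.0" to False and True; reject anything else."""
    mapping = {"0.0": False, "1.0": True}
    try:
        return mapping[text]
    except KeyError:
        raise ValueError(f"unexpected flag value: {text!r}") from None


sample = ["0.0", "0.0", "0.0", "0.0", "1.0"]  # the five Sample rows above
flags = [to_flag(s) for s in sample]
print(flags)
```

Raising on unknown values rather than defaulting to False keeps a later data change from being silently recoded.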

units_per_case
Real number (ℝ≥0)

Distinct: 36
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.86069372
Minimum: 1
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 10
Median: 19
Q3: 28
95-th percentile: 34
Maximum: 36
Range: 35
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 10.25855474
Coefficient of variation (CV): 0.543911846
Kurtosis: -1.251866292
Mean: 18.86069372
Median Absolute Deviation (MAD): 9
Skewness: -0.08362695217
Sum: 1139714
Variance: 105.2379453
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
Value              Count  Frequency (%)
29                 2294   3.8%
6                  2285   3.8%
33                 2217   3.7%
31                 2107   3.5%
23                 2085   3.5%
30                 2073   3.4%
26                 2013   3.3%
25                 1993   3.3%
5                  1958   3.2%
9                  1950   3.2%
Other values (26)  39453  65.3%

Value  Count  Frequency (%)
1      952    1.6%
2      1558   2.6%
3      1877   3.1%
4      1573   2.6%
5      1958   3.2%
6      2285   3.8%
7      1480   2.4%
8      1364   2.3%
9      1950   3.2%
10     1387   2.3%

Value  Count  Frequency (%)
36     749    1.2%
35     1577   2.6%
34     1740   2.9%
33     2217   3.7%
32     1607   2.7%
31     2107   3.5%
30     2073   3.4%
29     2294   3.8%
28     1536   2.5%
27     1632   2.7%

store_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Supermarket: 26192
Deluxe Supermarket: 22954
Gourmet Supermarket: 6503
Mid-Size Grocery: 2846
Small Grocery: 1933

Length

Max length: 19
Median length: 18
Mean length: 14.81938836
Min length: 11

Characters and Unicode

Total characters: 895506
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Deluxe Supermarket
2nd row: Deluxe Supermarket
3rd row: Deluxe Supermarket
4th row: Deluxe Supermarket
5th row: Deluxe Supermarket

Common Values

Value                Count  Frequency (%)
Supermarket          26192  43.3%
Deluxe Supermarket   22954  38.0%
Gourmet Supermarket  6503   10.8%
Mid-Size Grocery     2846   4.7%
Small Grocery        1933   3.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
supermarket55649
58.8%
deluxe22954
24.2%
gourmet6503
 
6.9%
grocery4779
 
5.0%
mid-size2846
 
3.0%
small1933
 
2.0%

Most occurring characters

ValueCountFrequency (%)
e171334
19.1%
r127359
14.2%
u85106
9.5%
m64085
 
7.2%
t62152
 
6.9%
S60428
 
6.7%
a57582
 
6.4%
p55649
 
6.2%
k55649
 
6.2%
34236
 
3.8%
Other values (12)121926
13.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter760914
85.0%
Uppercase Letter97510
 
10.9%
Space Separator34236
 
3.8%
Dash Punctuation2846
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e171334
22.5%
r127359
16.7%
u85106
11.2%
m64085
 
8.4%
t62152
 
8.2%
a57582
 
7.6%
p55649
 
7.3%
k55649
 
7.3%
l26820
 
3.5%
x22954
 
3.0%
Other values (6)32224
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
S60428
62.0%
D22954
 
23.5%
G11282
 
11.6%
M2846
 
2.9%
Space Separator
ValueCountFrequency (%)
34236
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2846
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin858424
95.9%
Common37082
 
4.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e171334
20.0%
r127359
14.8%
u85106
9.9%
m64085
 
7.5%
t62152
 
7.2%
S60428
 
7.0%
a57582
 
6.7%
p55649
 
6.5%
k55649
 
6.5%
l26820
 
3.1%
Other values (10)92260
10.7%
Common
ValueCountFrequency (%)
34236
92.3%
-2846
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII895506
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e171334
19.1%
r127359
14.2%
u85106
9.5%
m64085
 
7.2%
t62152
 
6.9%
S60428
 
6.7%
a57582
 
6.4%
p55649
 
6.2%
k55649
 
6.2%
34236
 
3.8%
Other values (12)121926
13.6%

store_city
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Tacoma: 5704
Salem: 5478
Portland: 5150
Seattle: 5051
Hidalgo: 4761
Other values (14): 34284

Length

Max length: 13
Median length: 10
Mean length: 7.904564109
Min length: 5

Characters and Unicode

Total characters: 477657
Distinct characters: 37
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Salem
2nd row: Salem
3rd row: Salem
4th row: Salem
5th row: Salem

Common Values

Value             Count  Frequency (%)
Tacoma            5704   9.4%
Salem             5478   9.1%
Portland          5150   8.5%
Seattle           5051   8.4%
Hidalgo           4761   7.9%
Merida            4498   7.4%
Spokane           4453   7.4%
Beverly Hills     4151   6.9%
Los Angeles       3960   6.6%
Bremerton         3451   5.7%
Other values (9)  13771  22.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
tacoma5704
 
8.1%
salem5478
 
7.7%
portland5150
 
7.3%
seattle5051
 
7.1%
hidalgo4761
 
6.7%
merida4498
 
6.4%
spokane4453
 
6.3%
beverly4151
 
5.9%
hills4151
 
5.9%
los3960
 
5.6%
Other values (13)23360
33.0%

Most occurring characters

ValueCountFrequency (%)
a60687
 
12.7%
e53145
 
11.1%
l40220
 
8.4%
o37479
 
7.8%
r28508
 
6.0%
n22675
 
4.7%
i21475
 
4.5%
t20678
 
4.3%
c17993
 
3.8%
m17696
 
3.7%
Other values (27)157101
32.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter396651
83.0%
Uppercase Letter70717
 
14.8%
Space Separator10289
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a60687
15.3%
e53145
13.4%
l40220
10.1%
o37479
9.4%
r28508
 
7.2%
n22675
 
5.7%
i21475
 
5.4%
t20678
 
5.2%
c17993
 
4.5%
m17696
 
4.5%
Other values (13)76095
19.2%
Uppercase Letter
ValueCountFrequency (%)
S15765
22.3%
H8912
12.6%
B8313
11.8%
M5893
 
8.3%
T5704
 
8.1%
A5466
 
7.7%
P5150
 
7.3%
V3964
 
5.6%
L3960
 
5.6%
C3747
 
5.3%
Other values (3)3843
 
5.4%
Space Separator
ValueCountFrequency (%)
10289
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin467368
97.8%
Common10289
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a60687
13.0%
e53145
 
11.4%
l40220
 
8.6%
o37479
 
8.0%
r28508
 
6.1%
n22675
 
4.9%
i21475
 
4.6%
t20678
 
4.4%
c17993
 
3.8%
m17696
 
3.8%
Other values (26)146812
31.4%
Common
ValueCountFrequency (%)
10289
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII477657
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a60687
 
12.7%
e53145
 
11.1%
l40220
 
8.4%
o37479
 
7.8%
r28508
 
6.0%
n22675
 
4.7%
i21475
 
4.5%
t20678
 
4.3%
c17993
 
3.8%
m17696
 
3.7%
Other values (27)157101
32.9%

store_state
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Value               Count
WA                  19370
OR                  10628
CA                  8894
Zacatecas           7113
Yucatan             4498
Other values (5)    9925

Length

Max length: 9
Median length: 2
Mean length: 3.642251936
Min length: 2

Characters and Unicode

Total characters: 220094
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
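
As a rough illustration of how per-category character counts like those in this report can be derived, the sketch below uses Python's standard `unicodedata` module. The helper name `char_category_counts` and the four-value sample are assumptions made for this example; this is not the profiling tool's own implementation.

```python
import unicodedata
from collections import Counter


def char_category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. 'Lu' = Uppercase Letter, 'Ll' = Lowercase Letter
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical sample drawn from the store_state values listed in this report.
sample = ["WA", "OR", "Zacatecas", "Yucatan"]
counts = char_category_counts(sample)
print(counts["Lu"], counts["Ll"])
```

Applied to the full column, counts of this kind would correspond to the "Uppercase Letter" and "Lowercase Letter" rows of the category tables.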

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: OR
2nd row: OR
3rd row: OR
4th row: OR
5th row: OR

Common Values

Value               Count     Frequency (%)
WA                  19370     32.1%
OR                  10628     17.6%
CA                  8894      14.7%
Zacatecas           7113      11.8%
Yucatan             4498      7.4%
BC                  3964      6.6%
Veracruz            2621      4.3%
Guerrero            1506      2.5%
DF                  1395      2.3%
Jalisco             439       0.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count     Frequency (%)
wa                  19370     32.1%
or                  10628     17.6%
ca                  8894      14.7%
zacatecas           7113      11.8%
yucatan             4498      7.4%
bc                  3964      6.6%
veracruz            2621      4.3%
guerrero            1506      2.5%
df                  1395      2.3%
jalisco             439       0.7%

Most occurring characters

Value               Count     Frequency (%)
a                   33395     15.2%
A                   28264     12.8%
c                   21784     9.9%
W                   19370     8.8%
C                   12858     5.8%
e                   12746     5.8%
t                   11611     5.3%
O                   10628     4.8%
R                   10628     4.8%
r                   9760      4.4%
Other values (15)   49050     22.3%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    115415    52.4%
Uppercase Letter    104679    47.6%

Most frequent character per category

Uppercase Letter
Value               Count     Frequency (%)
A                   28264     27.0%
W                   19370     18.5%
C                   12858     12.3%
O                   10628     10.2%
R                   10628     10.2%
Z                   7113      6.8%
Y                   4498      4.3%
B                   3964      3.8%
V                   2621      2.5%
G                   1506      1.4%
Other values (3)    3229      3.1%

Lowercase Letter
Value               Count     Frequency (%)
a                   33395     28.9%
c                   21784     18.9%
e                   12746     11.0%
t                   11611     10.1%
r                   9760      8.5%
u                   8625      7.5%
s                   7552      6.5%
n                   4498      3.9%
z                   2621      2.3%
o                   1945      1.7%
Other values (2)    878       0.8%

Most occurring scripts

Value               Count     Frequency (%)
Latin               220094    100.0%

Most frequent character per script

Latin
Value               Count     Frequency (%)
a                   33395     15.2%
A                   28264     12.8%
c                   21784     9.9%
W                   19370     8.8%
C                   12858     5.8%
e                   12746     5.8%
t                   11611     5.3%
O                   10628     4.8%
R                   10628     4.8%
r                   9760      4.4%
Other values (15)   49050     22.3%

Most occurring blocks

Value               Count     Frequency (%)
ASCII               220094    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
a                   33395     15.2%
A                   28264     12.8%
c                   21784     9.9%
W                   19370     8.8%
C                   12858     5.8%
e                   12746     5.8%
t                   11611     5.3%
O                   10628     4.8%
R                   10628     4.8%
r                   9760      4.4%
Other values (15)   49050     22.3%

store_sqft
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27988.47749
Minimum: 20319
Maximum: 39696
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 20319
5-th percentile: 20319
Q1: 23593
Median: 27694
Q3: 30797
95-th percentile: 39696
Maximum: 39696
Range: 19377
Interquartile range (IQR): 7204

Descriptive statistics

Standard deviation: 5701.02209
Coefficient of variation (CV): 0.2036917546
Kurtosis: -0.9371595153
Mean: 27988.47749
Median Absolute Deviation (MAD): 4101
Skewness: 0.3866785371
Sum: 1691287718
Variance: 32501652.87
Monotonicity: Not monotonic
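
Several of the store_sqft figures above follow from the others by simple arithmetic. The short check below recomputes them from the reported values, using the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). The variable names are chosen for this example only.

```python
# Values copied from the store_sqft statistics in this report.
mean = 27988.47749
std = 5701.02209
minimum, maximum = 20319, 39696
q1, q3 = 23593, 30797

value_range = maximum - minimum   # reported Range
iqr = q3 - q1                     # reported Interquartile range (IQR)
cv = std / mean                   # reported Coefficient of variation (CV)
variance = std ** 2               # reported Variance

print(value_range, iqr, round(cv, 6), round(variance, 1))
```

The recomputed values agree with the reported Range (19377), IQR (7204), CV (about 0.2037) and Variance (about 32501652.9) up to rounding.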
Histogram with fixed size bins (bins=20)

Common values
Value               Count     Frequency (%)
33858               5704      9.4%
27694               5478      9.1%
20319               5150      8.5%
21215               5051      8.4%
30797               4498      7.4%
30268               4453      7.4%
23688               4151      6.9%
23598               3960      6.6%
30584               3890      6.4%
39696               3451      5.7%
Other values (10)   14642     24.2%

Minimum 10 values
Value               Count     Frequency (%)
20319               5150      8.5%
21215               5051      8.4%
22478               783       1.3%
23112               3384      5.6%
23593               1506      2.5%
23598               3960      6.6%
23688               4151      6.9%
23759               2352      3.9%
24597               439       0.7%
27694               5478      9.1%

Maximum 10 values
Value               Count     Frequency (%)
39696               3451      5.7%
38382               871       1.4%
36509               1395      2.3%
34791               2621      4.3%
34452               580       1.0%
33858               5704      9.4%
30797               4498      7.4%
30584               3890      6.4%
30268               4453      7.4%
28206               711       1.2%

grocery_sqft
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19133.7997
Minimum: 13305
Maximum: 30351
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 13305
5-th percentile: 13305
Q1: 16232
Median: 18670
Q3: 22123
95-th percentile: 26354
Maximum: 30351
Range: 17046
Interquartile range (IQR): 5891

Descriptive statistics

Standard deviation: 3987.395735
Coefficient of variation (CV): 0.2083953944
Kurtosis: -0.5421637949
Mean: 19133.7997
Median Absolute Deviation (MAD): 3333
Skewness: 0.3853139027
Sum: 1156217248
Variance: 15899324.74
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common values
Value               Count     Frequency (%)
22123               5704      9.4%
18670               5478      9.1%
16232               5150      8.5%
13305               5051      8.4%
20141               4498      7.4%
22063               4453      7.4%
15337               4151      6.9%
14210               3960      6.6%
21938               3890      6.4%
24390               3451      5.7%
Other values (10)   14642     24.2%

Minimum 10 values
Value               Count     Frequency (%)
13305               5051      8.4%
14210               3960      6.6%
15012               439       0.7%
15321               783       1.3%
15337               4151      6.9%
16232               5150      8.5%
16418               3384      5.6%
16844               2352      3.9%
17475               1506      2.5%
18670               5478      9.1%

Maximum 10 values
Value               Count     Frequency (%)
30351               871       1.4%
27463               580       1.0%
26354               2621      4.3%
24390               3451      5.7%
22450               1395      2.3%
22271               711       1.2%
22123               5704      9.4%
22063               4453      7.4%
21938               3890      6.4%
20141               4498      7.4%

frozen_sqft
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5312.852552
Minimum: 2452
Maximum: 9184
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 2452
5-th percentile: 2452
Q1: 4746
Median: 5062
Q3: 5751
95-th percentile: 9184
Maximum: 9184
Range: 6732
Interquartile range (IQR): 1005

Descriptive statistics

Standard deviation: 1575.907263
Coefficient of variation (CV): 0.2966216825
Kurtosis: 0.6052844351
Mean: 5312.852552
Median Absolute Deviation (MAD): 571
Skewness: 0.5610409561
Sum: 321045054
Variance: 2483483.701
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common values
Value               Count     Frequency (%)
7041                5704      9.4%
5415                5478      9.1%
2452                5150      8.5%
4746                5051      8.4%
6393                4498      7.4%
4923                4453      7.4%
5011                4151      6.9%
5633                3960      6.6%
5188                3890      6.4%
9184                3451      5.7%
Other values (10)   14642     24.2%

Minimum 10 values
Value               Count     Frequency (%)
2452                5150      8.5%
3561                711       1.2%
3671                1506      2.5%
4016                3384      5.6%
4149                2352      3.9%
4193                580       1.0%
4294                783       1.3%
4746                5051      8.4%
4819                871       1.4%
4923                4453      7.4%

Maximum 10 values
Value               Count     Frequency (%)
9184                3451      5.7%
8435                1395      2.3%
7041                5704      9.4%
6393                4498      7.4%
5751                439       0.7%
5633                3960      6.6%
5415                5478      9.1%
5188                3890      6.4%
5062                2621      4.3%
5011                4151      6.9%

meat_sqft
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3541.84628
Minimum: 1635
Maximum: 6122
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 1635
5-th percentile: 1635
Q1: 3164
Median: 3375
Q3: 3834
95-th percentile: 6122
Maximum: 6122
Range: 4487
Interquartile range (IQR): 670

Descriptive statistics

Standard deviation: 1050.471635
Coefficient of variation (CV): 0.2965887145
Kurtosis: 0.6048480774
Mean: 3541.84628
Median Absolute Deviation (MAD): 380
Skewness: 0.5612377606
Sum: 214026687
Variance: 1103490.656
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common values
Value               Count     Frequency (%)
4694                5704      9.4%
3610                5478      9.1%
1635                5150      8.5%
3164                5051      8.4%
4262                4498      7.4%
3282                4453      7.4%
3340                4151      6.9%
3755                3960      6.6%
3458                3890      6.4%
6122                3451      5.7%
Other values (10)   14642     24.2%

Minimum 10 values
Value               Count     Frequency (%)
1635                5150      8.5%
2374                711       1.2%
2447                1506      2.5%
2678                3384      5.6%
2766                2352      3.9%
2795                580       1.0%
2863                783       1.3%
3164                5051      8.4%
3213                871       1.4%
3282                4453      7.4%

Maximum 10 values
Value               Count     Frequency (%)
6122                3451      5.7%
5624                1395      2.3%
4694                5704      9.4%
4262                4498      7.4%
3834                439       0.7%
3755                3960      6.6%
3610                5478      9.1%
3458                3890      6.4%
3375                2621      4.3%
3340                4151      6.9%

coffee_bar
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Value               Count
1.0                 37021
0.0                 23407

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value               Count     Frequency (%)
1.0                 37021     61.3%
0.0                 23407     38.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count     Frequency (%)
1.0                 37021     61.3%
0.0                 23407     38.7%

Most occurring characters

Value               Count     Frequency (%)
0                   83835     46.2%
.                   60428     33.3%
1                   37021     20.4%
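
Because coffee_bar is stored as the strings "1.0" and "0.0", the character counts above can be reproduced from the value counts alone. The arithmetic below is a small consistency check under that assumption; the variable names are illustrative.

```python
# Value counts reported for coffee_bar.
n_ones = 37021    # rows equal to "1.0"
n_zeros = 23407   # rows equal to "0.0"
n_rows = n_ones + n_zeros

char_counts = {
    # every row ends in '0'; rows equal to "0.0" contribute one more
    "0": n_rows + n_zeros,
    # every row contains exactly one '.'
    ".": n_rows,
    # only rows equal to "1.0" contain a '1'
    "1": n_ones,
}
total_chars = sum(char_counts.values())
print(char_counts, total_chars)
```

The totals match the reported 83835, 60428 and 37021 occurrences, and 181284 characters overall (three per row).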

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      120856    66.7%
Other Punctuation   60428     33.3%

Most frequent character per category

Decimal Number
Value               Count     Frequency (%)
0                   83835     69.4%
1                   37021     30.6%

Other Punctuation
Value               Count     Frequency (%)
.                   60428     100.0%

Most occurring scripts

Value               Count     Frequency (%)
Common              181284    100.0%

Most frequent character per script

Common
Value               Count     Frequency (%)
0                   83835     46.2%
.                   60428     33.3%
1                   37021     20.4%

Most occurring blocks

Value               Count     Frequency (%)
ASCII               181284    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
0                   83835     46.2%
.                   60428     33.3%
1                   37021     20.4%

video_store
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Value               Count
0.0                 39027
1.0                 21401

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value               Count     Frequency (%)
0.0                 39027     64.6%
1.0                 21401     35.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count     Frequency (%)
0.0                 39027     64.6%
1.0                 21401     35.4%

Most occurring characters

Value               Count     Frequency (%)
0                   99455     54.9%
.                   60428     33.3%
1                   21401     11.8%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      120856    66.7%
Other Punctuation   60428     33.3%

Most frequent character per category

Decimal Number
Value               Count     Frequency (%)
0                   99455     82.3%
1                   21401     17.7%

Other Punctuation
Value               Count     Frequency (%)
.                   60428     100.0%

Most occurring scripts

Value               Count     Frequency (%)
Common              181284    100.0%

Most frequent character per script

Common
Value               Count     Frequency (%)
0                   99455     54.9%
.                   60428     33.3%
1                   21401     11.8%

Most occurring blocks

Value               Count     Frequency (%)
ASCII               181284    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
0                   99455     54.9%
.                   60428     33.3%
1                   21401     11.8%

salad_bar
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Value               Count
1.0                 35529
0.0                 24899

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value               Count     Frequency (%)
1.0                 35529     58.8%
0.0                 24899     41.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count     Frequency (%)
1.0                 35529     58.8%
0.0                 24899     41.2%

Most occurring characters

Value               Count     Frequency (%)
0                   85327     47.1%
.                   60428     33.3%
1                   35529     19.6%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      120856    66.7%
Other Punctuation   60428     33.3%

Most frequent character per category

Decimal Number
Value               Count     Frequency (%)
0                   85327     70.6%
1                   35529     29.4%

Other Punctuation
Value               Count     Frequency (%)
.                   60428     100.0%

Most occurring scripts

Value               Count     Frequency (%)
Common              181284    100.0%

Most frequent character per script

Common
Value               Count     Frequency (%)
0                   85327     47.1%
.                   60428     33.3%
1                   35529     19.6%

Most occurring blocks

Value               Count     Frequency (%)
ASCII               181284    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
0                   85327     47.1%
.                   60428     33.3%
1                   35529     19.6%

prepared_food
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Value               Count
1.0                 35529
0.0                 24899

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value               Count     Frequency (%)
1.0                 35529     58.8%
0.0                 24899     41.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count     Frequency (%)
1.0                 35529     58.8%
0.0                 24899     41.2%

Most occurring characters

Value               Count     Frequency (%)
0                   85327     47.1%
.                   60428     33.3%
1                   35529     19.6%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      120856    66.7%
Other Punctuation   60428     33.3%

Most frequent character per category

Decimal Number
Value               Count     Frequency (%)
0                   85327     70.6%
1                   35529     29.4%

Other Punctuation
Value               Count     Frequency (%)
.                   60428     100.0%

Most occurring scripts

Value               Count     Frequency (%)
Common              181284    100.0%

Most frequent character per script

Common
Value               Count     Frequency (%)
0                   85327     47.1%
.                   60428     33.3%
1                   35529     19.6%

Most occurring blocks

Value               Count     Frequency (%)
ASCII               181284    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
0                   85327     47.1%
.                   60428     33.3%
1                   35529     19.6%

florist
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Value               Count
1.0                 33997
0.0                 26431

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 181284
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value               Count     Frequency (%)
1.0                 33997     56.3%
0.0                 26431     43.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count     Frequency (%)
1.0                 33997     56.3%
0.0                 26431     43.7%

Most occurring characters

Value               Count     Frequency (%)
0                   86859     47.9%
.                   60428     33.3%
1                   33997     18.8%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      120856    66.7%
Other Punctuation   60428     33.3%

Most frequent character per category

Decimal Number
Value               Count     Frequency (%)
0                   86859     71.9%
1                   33997     28.1%

Other Punctuation
Value               Count     Frequency (%)
.                   60428     100.0%

Most occurring scripts

Value               Count     Frequency (%)
Common              181284    100.0%

Most frequent character per script

Common
Value               Count     Frequency (%)
0                   86859     47.9%
.                   60428     33.3%
1                   33997     18.8%

Most occurring blocks

Value               Count     Frequency (%)
ASCII               181284    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
0                   86859     47.9%
.                   60428     33.3%
1                   33997     18.8%

media_type
Categorical

HIGH CORRELATION

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 472.2 KiB

Value                     Count
Daily Paper, Radio        6820
Product Attachment        5371
Daily Paper, Radio, TV    5284
Daily Paper               5119
Street Handout            5069
Other values (8)          32765

Length

Max length: 23
Median length: 18
Mean length: 14.72511088
Min length: 2

Characters and Unicode

Total characters: 889809
Distinct characters: 33
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Daily Paper, Radio
2nd row: Daily Paper, Radio
3rd row: Daily Paper, Radio
4th row: In-Store Coupon
5th row: Radio

Common Values

Value                     Count     Frequency (%)
Daily Paper, Radio        6820      11.3%
Product Attachment        5371      8.9%
Daily Paper, Radio, TV    5284      8.7%
Daily Paper               5119      8.5%
Street Handout            5069      8.4%
Radio                     4980      8.2%
Sunday Paper              4859      8.0%
In-Store Coupon           4495      7.4%
Sunday Paper, Radio       4050      6.7%
Cash Register Handout     4002      6.6%
Other values (3)          10379     17.2%

Length

Histogram of lengths of the category

Value               Count     Frequency (%)
paper               29478     20.4%
radio               24480     16.9%
daily               17223     11.9%
sunday              12255     8.5%
tv                  12206     8.5%
handout             9071      6.3%
product             5371      3.7%
attachment          5371      3.7%
street              5069      3.5%
in-store            4495      3.1%
Other values (5)    19413     13.4%
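
The word table above appears to count lowercased tokens with surrounding punctuation removed ("Paper," becomes "paper", while "in-store" keeps its hyphen). The sketch below shows one simple way to obtain counts of that shape; the `word_counts` helper is hypothetical and the tokenisation rule is an assumption, not necessarily what the profiling tool does internally.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased, whitespace-separated tokens, trimming edge punctuation."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            token = token.strip(string.punctuation)  # "paper," -> "paper"
            if token:
                counts[token] += 1
    return counts


# The five sample rows shown for media_type in this report.
rows = [
    "Daily Paper, Radio",
    "Daily Paper, Radio",
    "Daily Paper, Radio",
    "In-Store Coupon",
    "Radio",
]
counts = word_counts(rows)
print(counts.most_common())
```

On these five rows, "radio" is the most frequent token, followed by "daily" and "paper".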

Most occurring characters

Value               Count     Frequency (%)
a                   105337    11.8%
(space)             84004     9.4%
e                   57486     6.5%
o                   52407     5.9%
d                   51177     5.8%
t                   49190     5.5%
i                   49162     5.5%
r                   48415     5.4%
n                   35687     4.0%
P                   34849     3.9%
Other values (23)   322095    36.2%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    612047    68.8%
Uppercase Letter    161133    18.1%
Space Separator     84004     9.4%
Other Punctuation   28130     3.2%
Dash Punctuation    4495      0.5%

Most frequent character per category

Lowercase Letter
Value               Count     Frequency (%)
a                   105337    17.2%
e                   57486     9.4%
o                   52407     8.6%
d                   51177     8.4%
t                   49190     8.0%
i                   49162     8.0%
r                   48415     7.9%
n                   35687     5.8%
u                   34649     5.7%
p                   33973     5.6%
Other values (8)    94564     15.5%

Uppercase Letter
Value               Count     Frequency (%)
P                   34849     21.6%
R                   28482     17.7%
S                   21819     13.5%
D                   17223     10.7%
T                   12206     7.6%
V                   12206     7.6%
H                   9071      5.6%
C                   8497      5.3%
A                   5371      3.3%
I                   4495      2.8%
Other values (2)    6914      4.3%

Space Separator
Value               Count     Frequency (%)
(space)             84004     100.0%

Other Punctuation
Value               Count     Frequency (%)
,                   28130     100.0%

Dash Punctuation
Value               Count     Frequency (%)
-                   4495      100.0%

Most occurring scripts

Value               Count     Frequency (%)
Latin               773180    86.9%
Common              116629    13.1%

Most frequent character per script

Latin
Value               Count     Frequency (%)
a                   105337    13.6%
e                   57486     7.4%
o                   52407     6.8%
d                   51177     6.6%
t                   49190     6.4%
i                   49162     6.4%
r                   48415     6.3%
n                   35687     4.6%
P                   34849     4.5%
u                   34649     4.5%
Other values (20)   254821    33.0%

Common
Value               Count     Frequency (%)
(space)             84004     72.0%
,                   28130     24.1%
-                   4495      3.9%

Most occurring blocks

Value               Count     Frequency (%)
ASCII               889809    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
a                   105337    11.8%
(space)             84004     9.4%
e                   57486     6.5%
o                   52407     5.9%
d                   51177     5.8%
t                   49190     5.5%
i                   49162     5.5%
r                   48415     5.4%
n                   35687     4.0%
P                   34849     3.9%
Other values (23)   322095    36.2%

cost
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 328
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.26236612
Minimum: 50.79
Maximum: 149.75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 472.2 KiB

Quantile statistics

Minimum: 50.79
5-th percentile: 53.82
Q1: 69.65
Median: 98.52
Q3: 126.62
95-th percentile: 145.41
Maximum: 149.75
Range: 98.96
Interquartile range (IQR): 56.97

Descriptive statistics

Standard deviation: 30.01125719
Coefficient of variation (CV): 0.3023427544
Kurtosis: -1.265830485
Mean: 99.26236612
Median Absolute Deviation (MAD): 28.1
Skewness: 0.03723895084
Sum: 5998226.26
Variance: 900.6755578
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count     Frequency (%)
101.84              839       1.4%
69.63               763       1.3%
59.86               726       1.2%
81.79               698       1.2%
131.81              619       1.0%
126.62              593       1.0%
92.57               576       1.0%
69.47               539       0.9%
99.38               532       0.9%
91.28               530       0.9%
Other values (318)  54013     89.4%

Minimum 10 values
Value               Count     Frequency (%)
50.79               232       0.4%
51                  241       0.4%
51.12               432       0.7%
51.16               50        0.1%
51.27               120       0.2%
51.47               133       0.2%
52.06               328       0.5%
52.42               31        0.1%
52.77               72        0.1%
52.97               170       0.3%

Maximum 10 values
Value               Count     Frequency (%)
149.75              149       0.2%
149.08              394       0.7%
148.87              150       0.2%
148.62              329       0.5%
147.82              344       0.6%
147.35              125       0.2%
147.18              177       0.3%
147.17              214       0.4%
146.72              529       0.9%
146.41              144       0.2%

Interactions

2024-09-15T12:57:09.563415image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:11.164506image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:12.802098image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:14.314318image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:15.768890image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:17.282778image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:19.638540image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:21.173712image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:22.691357image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:24.429386image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:25.969067image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:27.539127image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:28.973358image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:30.745017image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:09.672762image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:11.274241image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:12.915807image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:14.418044image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:15.870944image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:17.392484image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:19.747249image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:21.284433image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:22.800081image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:24.551025image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:26.079322image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:27.643848image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:29.078066image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:30.841763image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:09.768525image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:11.378961image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:13.032424image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:14.517733image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:15.968742image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:17.495965image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:19.849474image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:21.391146image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:22.903204image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:24.657765image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:26.187007image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:27.742583image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2024-09-15T12:57:29.172785image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
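
As a minimal sketch of that calculation (not the report generator's own code), the snippet below ranks both variables, averaging ranks over ties, and then correlates the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

On this toy input the ranks of `x` and `y` coincide, so `rho` reaches its maximum while `r` stays below 1, which is the sense in which ρ captures nonlinear monotonic relationships.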

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
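
The following sketch, written under the assumption of plain Python lists as input, computes r from that definition and checks the invariance property on made-up data. The function name `pearson_r` and the sample values are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors in the covariance and the standard deviations cancel,
    so plain sums of deviations are used.
    """
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data, chosen only to illustrate the property.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r_original = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_shifted = [0.5 * v - 2.0 for v in x]
y_shifted = [3.0 * v + 5.0 for v in y]
r_transformed = pearson_r(x_shifted, y_shifted)
```

`r_original` and `r_transformed` agree up to floating-point rounding. A negative scale factor on one variable would flip the sign of r without changing its magnitude.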

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
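
The pair-counting description can be sketched as follows. This is the simple form stated above (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations frequently use tie-adjusted variants, so results may differ on data with ties. The function name `kendall_tau_a` is an assumption for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, O(n^2) pair scan."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; it contributes to neither count.
    return (concordant - discordant) / len(pairs)


# Hypothetical data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant, 6 pairs
```

The quadratic pair scan is fine for a small illustration; on a dataset the size of this report's, a sorting-based algorithm would be the practical choice.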

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
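
A rough sketch of a bias-corrected Cramér's V in the style of Bergsma's correction is given below. It is an illustration under stated assumptions, not necessarily the exact code behind this report: the function name `cramers_v_corrected` and the toy category labels are hypothetical, and the chi-squared statistic is computed without a continuity correction.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Subtract the expected bias of phi^2 and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical labels: perfectly associated vs. exactly independent columns.
v_perfect = cramers_v_corrected(["x", "y"] * 50, ["p", "q"] * 50)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 25,
                                    ["p", "q", "p", "q"] * 25)
```

The `max(0.0, ...)` clamp is what keeps the corrected estimate at zero for independent columns, where the uncorrected statistic would otherwise be pushed negative by the bias term.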

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

| food_category | food_department | food_family | store_sales(in millions) | store_cost(in millions) | unit_sales(in millions) | promotion_name | sales_country | marital_status | gender | total_children | education | member_card | occupation | houseowner | avg_cars_at home(approx) | avg. yearly_income | num_children_at_home | avg_cars_at home(approx).1 | brand_name | SRP | gross_weight | net_weight | recyclable_package | low_fat | units_per_case | store_type | store_city | store_state | store_sqft | grocery_sqft | frozen_sqft | meat_sqft | coffee_bar | video_store | salad_bar | prepared_food | florist | media_type | cost
0 | Breakfast Foods | Frozen Foods | Food | 7.36 | 2.7232 | 4.0 | Bag Stuffers | USA | M | F | 1.0 | Partial High School | Normal | Skilled Manual | Y | 1.0 | $10K - $30K | 1.0 | 1.0 | Carrington | 1.84 | 19.70 | 17.70 | 1.0 | 0.0 | 17.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | Daily Paper, Radio | 126.62
1 | Breakfast Foods | Frozen Foods | Food | 5.52 | 2.5944 | 3.0 | Cash Register Lottery | USA | M | M | 0.0 | Bachelors Degree | Silver | Professional | Y | 4.0 | $50K - $70K | 0.0 | 4.0 | Carrington | 1.84 | 19.70 | 17.70 | 1.0 | 0.0 | 17.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | Daily Paper, Radio | 59.86
2 | Breakfast Foods | Frozen Foods | Food | 3.68 | 1.3616 | 2.0 | High Roller Savings | USA | S | F | 4.0 | Partial High School | Normal | Manual | N | 1.0 | $10K - $30K | 0.0 | 1.0 | Carrington | 1.84 | 19.70 | 17.70 | 1.0 | 0.0 | 17.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | Daily Paper, Radio | 84.16
3 | Breakfast Foods | Frozen Foods | Food | 3.68 | 1.1776 | 2.0 | Cash Register Lottery | USA | M | F | 2.0 | High School Degree | Bronze | Manual | Y | 2.0 | $30K - $50K | 2.0 | 2.0 | Carrington | 1.84 | 19.70 | 17.70 | 1.0 | 0.0 | 17.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | In-Store Coupon | 95.78
4 | Breakfast Foods | Frozen Foods | Food | 4.08 | 1.4280 | 3.0 | Double Down Sale | USA | M | M | 0.0 | Partial High School | Bronze | Skilled Manual | N | 2.0 | $30K - $50K | 0.0 | 2.0 | Golden | 1.36 | 7.12 | 5.11 | 0.0 | 1.0 | 29.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | Radio | 50.79
5 | Breakfast Foods | Frozen Foods | Food | 4.08 | 1.4688 | 3.0 | Double Down Sale | USA | M | F | 2.0 | Bachelors Degree | Bronze | Professional | N | 1.0 | $50K - $70K | 2.0 | 1.0 | Golden | 1.36 | 7.12 | 5.11 | 0.0 | 1.0 | 29.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | Radio | 50.79
6 | Breakfast Foods | Frozen Foods | Food | 5.44 | 2.5568 | 4.0 | Cash Register Lottery | USA | S | F | 4.0 | High School Degree | Bronze | Skilled Manual | N | 2.0 | $30K - $50K | 0.0 | 2.0 | Golden | 1.36 | 7.12 | 5.11 | 0.0 | 1.0 | 29.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | In-Store Coupon | 95.78
7 | Breakfast Foods | Frozen Foods | Food | 3.74 | 1.6082 | 2.0 | Cash Register Lottery | USA | S | M | 1.0 | Partial High School | Bronze | Manual | Y | 4.0 | $50K - $70K | 0.0 | 4.0 | Imagine | 1.87 | 16.70 | 14.70 | 1.0 | 1.0 | 10.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | Daily Paper, Radio | 59.86
8 | Breakfast Foods | Frozen Foods | Food | 4.08 | 1.4688 | 3.0 | Cash Register Lottery | USA | S | F | 2.0 | Partial High School | Normal | Skilled Manual | N | 2.0 | $10K - $30K | 0.0 | 2.0 | Golden | 1.36 | 7.12 | 5.11 | 0.0 | 1.0 | 29.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | Daily Paper, Radio | 59.86
9 | Breakfast Foods | Frozen Foods | Food | 9.72 | 4.5684 | 3.0 | High Roller Savings | USA | S | F | 3.0 | Graduate Degree | Bronze | Professional | N | 1.0 | $70K - $90K | 0.0 | 1.0 | Big Time | 3.24 | 16.30 | 14.20 | 1.0 | 0.0 | 25.0 | Deluxe Supermarket | Salem | OR | 27694.0 | 18670.0 | 5415.0 | 3610.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | Daily Paper, Radio | 84.16

Last rows

| food_category | food_department | food_family | store_sales(in millions) | store_cost(in millions) | unit_sales(in millions) | promotion_name | sales_country | marital_status | gender | total_children | education | member_card | occupation | houseowner | avg_cars_at home(approx) | avg. yearly_income | num_children_at_home | avg_cars_at home(approx).1 | brand_name | SRP | gross_weight | net_weight | recyclable_package | low_fat | units_per_case | store_type | store_city | store_state | store_sqft | grocery_sqft | frozen_sqft | meat_sqft | coffee_bar | video_store | salad_bar | prepared_food | florist | media_type | cost
60418 | Specialty | Carousel | Non-Consumable | 8.28 | 2.7324 | 3.0 | Save-It Sale | Mexico | M | M | 2.0 | Partial College | Golden | Management | N | 4.0 | $30K - $50K | 1.0 | 4.0 | ADJ | 2.76 | 19.6 | 18.6 | 1.0 | 0.0 | 26.0 | Supermarket | Acapulco | Guerrero | 23593.0 | 17475.0 | 3671.0 | 2447.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | In-Store Coupon | 67.63
60419 | Specialty | Carousel | Non-Consumable | 6.90 | 2.8290 | 3.0 | Two Day Sale | Mexico | S | M | 4.0 | Bachelors Degree | Golden | Professional | N | 2.0 | $70K - $90K | 0.0 | 2.0 | Prelude | 2.30 | 21.5 | 19.5 | 1.0 | 0.0 | 29.0 | Supermarket | Acapulco | Guerrero | 23593.0 | 17475.0 | 3671.0 | 2447.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Radio | 73.27
60420 | Specialty | Carousel | Non-Consumable | 4.84 | 1.6940 | 4.0 | Price Winners | Mexico | S | F | 1.0 | Partial High School | Normal | Skilled Manual | Y | 1.0 | $10K - $30K | 0.0 | 1.0 | Toretti | 1.21 | 18.9 | 15.8 | 0.0 | 0.0 | 26.0 | Supermarket | Acapulco | Guerrero | 23593.0 | 17475.0 | 3671.0 | 2447.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Sunday Paper | 112.19
60421 | Specialty | Carousel | Non-Consumable | 0.99 | 0.4554 | 1.0 | Green Light Special | USA | S | F | 2.0 | High School Degree | Bronze | Professional | Y | 3.0 | $130K - $150K | 0.0 | 3.0 | King | 0.99 | 11.7 | 10.6 | 1.0 | 0.0 | 25.0 | Small Grocery | San Francisco | CA | 22478.0 | 15321.0 | 4294.0 | 2863.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | Cash Register Handout | 127.19
60422 | Specialty | Carousel | Non-Consumable | 1.21 | 0.4477 | 1.0 | Unbeatable Price Savers | USA | S | F | 1.0 | Partial High School | Bronze | Skilled Manual | N | 2.0 | $50K - $70K | 0.0 | 2.0 | Toretti | 1.21 | 18.9 | 15.8 | 0.0 | 0.0 | 26.0 | Small Grocery | San Francisco | CA | 22478.0 | 15321.0 | 4294.0 | 2863.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | Sunday Paper, Radio | 78.45
60423 | Specialty | Carousel | Non-Consumable | 2.76 | 1.3248 | 1.0 | You Save Days | USA | M | F | 1.0 | Partial High School | Normal | Skilled Manual | Y | 1.0 | $10K - $30K | 1.0 | 1.0 | ADJ | 2.76 | 19.6 | 18.6 | 1.0 | 0.0 | 26.0 | Small Grocery | San Francisco | CA | 22478.0 | 15321.0 | 4294.0 | 2863.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | In-Store Coupon | 95.25
60424 | Specialty | Carousel | Non-Consumable | 1.60 | 0.4960 | 1.0 | Price Cutters | USA | S | F | 2.0 | High School Degree | Bronze | Skilled Manual | N | 2.0 | $30K - $50K | 0.0 | 2.0 | Symphony | 1.60 | 17.4 | 15.3 | 1.0 | 0.0 | 36.0 | Small Grocery | San Francisco | CA | 22478.0 | 15321.0 | 4294.0 | 2863.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | Sunday Paper | 69.42
60425 | Specialty | Carousel | Non-Consumable | 5.52 | 2.5392 | 2.0 | Weekend Markdown | USA | M | M | 1.0 | High School Degree | Bronze | Manual | Y | 3.0 | $30K - $50K | 0.0 | 3.0 | ADJ | 2.76 | 19.6 | 18.6 | 1.0 | 0.0 | 26.0 | Small Grocery | San Francisco | CA | 22478.0 | 15321.0 | 4294.0 | 2863.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | Sunday Paper, Radio, TV | 67.51
60426 | Specialty | Carousel | Non-Consumable | 8.28 | 2.5668 | 3.0 | Sales Days | Canada | S | M | 2.0 | Bachelors Degree | Bronze | Professional | N | 4.0 | $70K - $90K | 0.0 | 4.0 | ADJ | 2.76 | 19.6 | 18.6 | 1.0 | 0.0 | 26.0 | Mid-Size Grocery | Victoria | BC | 34452.0 | 27463.0 | 4193.0 | 2795.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | Sunday Paper | 132.88
60427 | Specialty | Carousel | Non-Consumable | 9.20 | 4.2320 | 4.0 | Super Duper Savers | Canada | S | F | 3.0 | Partial High School | Bronze | Manual | Y | 1.0 | $10K - $30K | 0.0 | 1.0 | Prelude | 2.30 | 21.5 | 19.5 | 1.0 | 0.0 | 29.0 | Mid-Size Grocery | Victoria | BC | 34452.0 | 27463.0 | 4193.0 | 2795.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | Daily Paper, Radio | 87.76